We present a statistical mechanics treatment of the stability of globular proteins which takes
explicitly into account the coupling between the protein and water degrees of freedom. This allows 
us to describe both the cold and the warm unfolding, thus qualitatively reproducing the known 
thermodynamics of proteins. 
The folded conformation of globular proteins is a state of matter peculiar in more than one respect. The density
is that of a condensed phase (solid or liquid), and the relative position of the atoms is, on average, fixed; these are
the characteristics of the solid state. However, solids are either crystalline or amorphous, and proteins are neither:
the folded structure, while ordered in the sense that each molecule of a given species is folded in the same way, lacks
the translational symmetry of a crystal. In Schrödinger's words, proteins are "aperiodic crystals". Unlike any other
known solids, globular proteins are not really rigid, being able to perform large conformational motions while retaining
locally the same folded structure. Finally, these are mesoscopic systems, consisting of a few thousand atoms.
Quantitatively, the peculiarities of this state of matter are perhaps best appreciated from the thermodynamics.
Delicate calorimetric measurements [1-3] on the folding transition of globular proteins reveal the following picture:
first, the transition is first order, at least in the case of single-domain proteins. Secondly, the stability of the folded
state, i.e. the difference in Gibbs potential $\Delta G$ between the unfolded and the folded state, is at most a fraction
of $kT_{\rm room}$ per amino acid. Following Privalov [3], we will refer to this property as "cooperativity". The Gibbs
potential difference $\Delta G$, as a function of temperature, is non-monotonic: it has a maximum around room temperature
(where $\Delta G > 0$ and so the folded form is stable), then crosses zero and becomes negative both for higher and lower
temperatures. Correspondingly, the protein unfolds not only at high, but also at low temperatures. This phenomenon
of "cold unfolding", which is observed experimentally, is most peculiar: solids usually do not melt upon cooling! For
temperatures around the cold unfolding transition and below, the enthalpy difference $\Delta H$ between the unfolded and
the folded state is negative; this means that cold unfolding proceeds with a release of heat (a negative latent heat),
as is also observed experimentally; at the higher unfolding transition, on the contrary, $\Delta H > 0$, which corresponds to
the usual situation of a positive latent heat. Fig. 1 shows Privalov's measurements of the specific heat of myoglobin.
There are two peaks in the specific heat, corresponding to the two unfolding transitions, and a large gap $\Delta C$ in the
specific heat between the unfolded and the folded state. This gap is again peculiar to proteins: usually, for a melting
transition, $\Delta C \approx 0$ (e.g. for ice at 0 °C $C = 1.01$ cal/gK while for water at 0 °C $C = 1.00$ cal/gK). The existence of
this gap $\Delta C$ is related to the phenomenon of cold unfolding [?].

From the microscopic point of view, the main driving force for folding is the hydrophobic effect. In the native state 
of globular proteins hydrophobic residues are generally found on the inside of the molecule, where they are shielded 
from the water, while hydrophilic residues are typically on the surface. In the following we refer to the difference 
in free energy between hydrophobic residues interacting with each other in the core of the folded protein and these 
same residues interacting with the water in the unfolded structure, including any changes in the microscopic states of 
the water, as the "hydrophobic interaction". Hydrogen bonds within the regular elements of secondary structure ($\alpha$
helices and $\beta$ sheets), while necessary for the stability of the native state, can hardly be thought of as providing the
positive $\Delta G$ of the folded structure, since the unfolded structure would form just as many hydrogen bonds with the
water. When the protein unfolds, the hydrophobic residues of the interior are exposed; this accounts for most of the
gap in the specific heat $\Delta C$ [?], according to the known effect that dissolving hydrophobic substances in water raises
the heat capacity of the solution [?].

As in other branches of physics, once the thermodynamics of a system is known it is desirable to develop a 
corresponding statistical mechanics picture. Several models have been proposed which address some aspects of the 
folding transition. In the "zipper model" [?], which was introduced to describe the helix-coil transition, the
relevant degrees of freedom (conformational angles) are treated as a set of variables which can take two values: one
corresponding to matching the ordered structure (helix), and the other corresponding to the "coil" state. The problem
is then equivalent to the 1-D Ising model. A related parametrization for the 3-D folding transition has been proposed
by Zwanzig, describing the folding transition in terms of variables, each of which rewards a match with the correct
ground state. A zipper model that deals with the initial pathway of protein folding has been proposed by Dill, Fiebig
and Chan [?]. For a review see [?]. A recent discussion of hydrophobicity in protein folding is in ref. [?].

However, to our knowledge no model exists which reproduces all the thermodynamic features surveyed above. With 
the present work, we address this question. 

We start with a Hamiltonian which we have recently introduced to describe self-assembly of a cooperatively 
stabilized (in the sense defined above) structure: 

$$H = -(\varphi_1 + \varphi_1\varphi_2 + \varphi_1\varphi_2\varphi_3 + \cdots + \varphi_1\varphi_2\cdots\varphi_N) \tag{1}$$

here the $\varphi_i$'s are variables which take on the values 0 and 1, and, in the spirit of the zipper model, we define the ground
state ($\varphi_i = 1\ \forall i$) as the template corresponding to the ordered, aperiodic structure, i.e. the folded state. One can
think of the $\varphi$'s as appropriately coarse-grained angle variables which define the conformation of the polypeptide
chain. The above Hamiltonian is then a description of a system that has a specific folding pathway, a property that
is well documented for proteins [11-15]. As discussed in [10], this system has a first order phase transition from an
ordered to a disordered state at temperature $T_m = 1/\ln 2$. The Hamiltonian (1) exhibits a hierarchical structure:
unless $\varphi_1, \varphi_2, \ldots, \varphi_i = 1$, it does not matter what value the rest of the variables $\varphi_{i+1}, \ldots, \varphi_N$ assume. As a
consequence, the system displays cooperativity, in the sense that the binding energy per degree of freedom in the
ordered state, for $T \approx T_m$, is only of order $kT_m$.
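As a sanity check on the transition temperature quoted above, the partition function of Hamiltonian (1) can be summed by brute force for a short chain and compared with the same sum organised by the number $n$ of leading matches. The sketch below is illustrative only: the function names and the chain length are assumptions made here, and units are chosen so that $k = 1$ and the energy scale is 1.

```python
import itertools
import math

def energy(phi):
    """Energy of Hamiltonian (1): minus the number of leading ones in phi."""
    e, prod = 0, 1
    for p in phi:
        prod *= p          # running product phi_1 * ... * phi_i
        e -= prod
    return e

def z_brute_force(N, T):
    """Sum exp(-E/T) over all 2**N configurations (small N only)."""
    return sum(math.exp(-energy(phi) / T)
               for phi in itertools.product((0, 1), repeat=N))

def z_by_matches(N, T):
    """Same sum organised by n, the number of consecutive matches.

    For n < N the variable phi_{n+1} is 0 and the remaining N-n-1 variables
    are free, giving a degeneracy 2**(N-n-1); the state n = N is unique.
    """
    z = sum(2 ** (N - n - 1) * math.exp(n / T) for n in range(N))
    return z + math.exp(N / T)

T_m = 1.0 / math.log(2.0)   # temperature at which exp(1/T) = 2
```

At $T = T_m$ every term with $n < N$ carries the same weight $2^{N-1}$, so the energy gained per match exactly balances the entropy lost, which is one way to read the value $T_m = 1/\ln 2$.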

In order to proceed further, it is necessary to take also the water into account. The relevant physics here is that 
dissolving a hydrophobic substance in water causes a large decrease in the entropy of the system [3]. This entropy
change is attributed to a partial ordering of the water molecules around the hydrophobic solute. The gradual melting
of this additional structure upon heating causes the increase in heat capacity. Consequently, we introduce a second
set of variables $\mu_1, \mu_2, \ldots, \mu_N$ which describe the water. These water degrees of freedom couple to the unfolded protein
degrees of freedom because these expose hydrophobic amino acids to the water. This is achieved by the Hamiltonian:

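$$H = -(\varphi_1 + \varphi_1\varphi_2 + \varphi_1\varphi_2\varphi_3 + \cdots + \varphi_1\varphi_2\cdots\varphi_N) + [(1 - \varphi_1)\mu_1 + (1 - \varphi_1\varphi_2)\mu_2 + \cdots + (1 - \varphi_1\varphi_2\cdots\varphi_N)\mu_N] \tag{2}$$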

The $\varphi_i$'s take on the values 0 or 1, as before. Each of the $\mu_i$'s can take a value from the set $\{\mathcal{E}_{\min} + s\,\Delta\mathcal{E},\ s =
0, 1, 2, \ldots, g-1\}$, where $\mathcal{E}_{\min} < 0$, $\Delta\mathcal{E} > 0$. If at least one of the variables $\varphi_1, \ldots, \varphi_i$ equals zero, the
contribution of the $i$'th water variable to the energy is $\mu_i$; otherwise it is zero. Therefore, when $\varphi_1 \cdots \varphi_i = 1$, the states
of the corresponding water degree of freedom are degenerate, with zero contribution to the energy and degeneracy
$g$.
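The coupling rule just stated (water variable $i$ contributes $\mu_i$ whenever at least one of $\varphi_1, \ldots, \varphi_i$ is zero) can be written out for a single configuration. This is a minimal sketch with the protein energy scale set to 1; the function name, the argument names and the example values are hypothetical.

```python
def energy(phi, s, e_min, d_e):
    """Energy of one configuration under Hamiltonian (2).

    phi : sequence of 0/1 protein variables phi_1..phi_N
    s   : sequence of integers in 0..g-1 labelling the water levels,
          so that mu_i = e_min + s_i * d_e
    """
    e, prod = 0.0, 1
    for p, s_i in zip(phi, s):
        prod *= p                               # phi_1 * ... * phi_i
        e -= prod                               # protein term
        e += (1 - prod) * (e_min + s_i * d_e)   # water term, active when prod == 0
    return e

# Two matches followed by a mismatch: the third and fourth water variables
# are coupled in, the first two contribute nothing.
example = energy((1, 1, 0, 1), (3, 0, 2, 1), e_min=-2.0, d_e=0.2)
```

Note that the value of `phi[3]` is irrelevant in the example, because the running product is already zero after the third variable.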

To reiterate, the physical meaning of this Hamiltonian is that the water molecules in contact with an unfolded 
portion of the protein go to a lower entropy state (compared to the water molecules in contact with a folded portion), 
but also, for low temperatures, to a more tightly bound state. The more specific features of the model (2), e.g. the 
structure of the energy spectrum, the particular coupling of the $\mu$'s to the $\varphi$'s, etc. can be varied while maintaining the
overall thermodynamic behavior described below. Here we just present the case which is simplest to solve analytically. 

The calculation of the partition function is straightforward. We parametrize the states of the system by the number 
$n$ of consecutive matches $\varphi_1 = 1, \varphi_2 = 1, \ldots, \varphi_n = 1$, ending with $\varphi_{n+1} = 0$, and the values $\{s_{n+1}, \ldots, s_N\}$, where
each $s_i \in \{0, 1, 2, \ldots, g-1\}$, for the $(N - n)$ $\mu$ variables coupled to the unfolded portion of the protein. The energy of
this state is 

$$E(n, s_{n+1}, \ldots, s_N) = -n\mathcal{E}_0 + \sum_{i=n+1}^{N} (\mathcal{E}_{\min} + \Delta\mathcal{E}\, s_i) \tag{3}$$

where we have introduced the energy scale $\mathcal{E}_0$ for the protein variable in order to make the formulas dimensionally
more transparent (up to now we used $\mathcal{E}_0 = 1$). Denoting $\beta = 1/T$ as the reciprocal temperature, the partition function
is

$$Z = \sum_{n=0}^{N-1} \sum_{s_{n+1}=0}^{g-1} \cdots \sum_{s_N=0}^{g-1} 2^{N-n-1} g^n \exp(-\beta E(n, s_{n+1}, \ldots, s_N)) + g^N \exp(\beta\mathcal{E}_0 N) \tag{4}$$

In the above equation the factor $2^{N-n-1}$ is the degeneracy of the unfolded protein degrees of freedom and the factor
$g^n$ is the degeneracy of water which is not exposed to the inside of the protein. Factorizing the sums over $s_i$ into
partition functions for each water degree of freedom we write: 

$$Z = \frac{1}{2}(2Z_w)^N \sum_{n=0}^{N-1} \left(\frac{g\exp(\beta\mathcal{E}_0)}{2Z_w}\right)^n + (g\exp(\beta\mathcal{E}_0))^N \tag{5}$$

where the phase space for a water degree of freedom exposed to an unfolded protein degree of freedom is 

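$$Z_w = \sum_{s=0}^{g-1} \exp(-\beta(\mathcal{E}_{\min} + s\,\Delta\mathcal{E})) = \frac{\exp(-\beta\mathcal{E}_{\min}) - \exp(-\beta\mathcal{E}_{\max})}{1 - \exp(-\beta\,\Delta\mathcal{E})} \tag{6}$$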

where $\mathcal{E}_{\max} = \mathcal{E}_{\min} + g\,\Delta\mathcal{E}$. From Eq. (5) one sees directly that the state of the system is determined by the size of
the quantity

$$\frac{g\exp(\beta\mathcal{E}_0)}{2Z_w} = \exp(\beta\,\Delta f) \tag{7}$$

If $\Delta f > 0$ then the system will be in the folded state because the sum in Eq. (5) is dominated by the last term,
whereas for $\Delta f < 0$ the system will be unfolded.

The sum in Eq. (5) can be readily performed and the total partition function is

$$Z = \frac{1}{2}(2Z_w)^N\, \frac{1 - (g\exp(\beta\mathcal{E}_0)/(2Z_w))^N}{1 - (g\exp(\beta\mathcal{E}_0)/(2Z_w))} + (g\exp(\beta\mathcal{E}_0))^N \tag{8}$$
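The closed forms for $Z_w$ and $Z$ can be checked against a direct enumeration of every protein and water configuration for a very small system. The following is a sketch under stated assumptions: the function names and the parameter values used for the check are illustrative and not taken from the paper.

```python
import itertools
import math

def z_water(beta, e_min, d_e, g):
    """Closed form of Eq. (6) for one exposed water variable."""
    e_max = e_min + g * d_e
    return ((math.exp(-beta * e_min) - math.exp(-beta * e_max))
            / (1.0 - math.exp(-beta * d_e)))

def z_closed(beta, N, e0, e_min, d_e, g):
    """Eq. (8): the geometric sum over n performed analytically."""
    zw = z_water(beta, e_min, d_e, g)
    x = g * math.exp(beta * e0) / (2.0 * zw)     # the ratio of Eq. (7)
    return (0.5 * (2.0 * zw) ** N * (1.0 - x ** N) / (1.0 - x)
            + (g * math.exp(beta * e0)) ** N)

def z_enumerated(beta, N, e0, e_min, d_e, g):
    """Direct sum over every (phi, s) configuration; small N and g only."""
    z = 0.0
    for phi in itertools.product((0, 1), repeat=N):
        for s in itertools.product(range(g), repeat=N):
            e, prod = 0.0, 1
            for p, s_i in zip(phi, s):
                prod *= p
                e += -e0 * prod + (1 - prod) * (e_min + s_i * d_e)
            z += math.exp(-beta * e)
    return z
```

The enumeration reproduces the degeneracy factors automatically: water variables attached to the folded part contribute zero energy for each of their $g$ levels, and protein variables beyond the first mismatch are free. The closed form in `z_closed` is singular when the ratio `x` equals 1 exactly (that is, at $\Delta f = 0$), where the sum over $n$ should be replaced by $N$.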

The free energy is $F = -T\ln(Z)$, the energy $E = -\partial\ln(Z)/\partial\beta$ and the heat capacity $C = dE/dT$. Because
there is no pressure in the model, the energy $E$ takes the place of the enthalpy $H = E + pV$ and the free energy
$F = E - TS$ takes the place of the Gibbs potential $G = H - TS$. In Fig. 2 we show the heat capacity per degree
of freedom for four different choices of $\mathcal{E}_{\min}$, representing four different values of the chemical potential, which we
discuss later. The characteristic feature is that there are two peaks corresponding to warm and cold unfolding, and a
gap $\Delta C$ in the heat capacity between the unfolded and the folded form. At higher temperatures, i.e. $T > g\,\Delta\mathcal{E}$, the
gap goes to zero because the water becomes effectively degenerate again. In Fig. 3a we show the order parameter $\langle n \rangle$
as function of temperature. The figure indeed confirms that the protein is folded between the two transitions.
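The order parameter and the heat capacity discussed above can be evaluated numerically from the weights in Eq. (5). The sketch below works with logarithms to avoid overflow at low temperature and obtains $C$ from a finite-difference second derivative of $F$. The function names are hypothetical, and the parameter values ($N = 60$, $g = 35$, $\Delta\mathcal{E} = 0.2$, with $\mathcal{E}_0 = 1$ and $\mathcal{E}_{\min} = -2$ so that $\mathcal{E}_0 + \mathcal{E}_{\min} = -1$) are an assumed reading of the figure captions, not a confirmed reproduction of the published curves.

```python
import math

def log_zw(T, e_min, d_e, g):
    """ln Z_w from the closed form of Eq. (6)."""
    beta = 1.0 / T
    return (-beta * e_min
            + math.log1p(-math.exp(-beta * g * d_e))
            - math.log1p(-math.exp(-beta * d_e)))

def log_weights(T, N, e0, e_min, d_e, g):
    """ln of the statistical weight of each n = 0..N in Eq. (5)."""
    beta = 1.0 / T
    lzw = log_zw(T, e_min, d_e, g)
    lg = math.log(g) + beta * e0
    lw = [(N - n - 1) * math.log(2.0) + n * lg + (N - n) * lzw
          for n in range(N)]
    lw.append(N * lg)                  # fully folded state, n = N
    return lw

def log_z(T, N, e0, e_min, d_e, g):
    """ln Z via a log-sum-exp over the weights."""
    lw = log_weights(T, N, e0, e_min, d_e, g)
    m = max(lw)
    return m + math.log(sum(math.exp(x - m) for x in lw))

def folded_fraction(T, N, e0, e_min, d_e, g):
    """<n>/N, the average fraction of matched protein variables."""
    lw = log_weights(T, N, e0, e_min, d_e, g)
    m = max(lw)
    w = [math.exp(x - m) for x in lw]
    return sum(n * wn for n, wn in enumerate(w)) / (N * sum(w))

def heat_capacity(T, N, e0, e_min, d_e, g, h=1e-3):
    """C = -T d^2F/dT^2 with F = -T ln Z, by a central finite difference."""
    def free_energy(t):
        return -t * log_z(t, N, e0, e_min, d_e, g)
    return -T * (free_energy(T + h) - 2.0 * free_energy(T)
                 + free_energy(T - h)) / h ** 2

# Assumed parameter set (see the lead-in).
PARAMS = dict(N=60, e0=1.0, e_min=-2.0, d_e=0.2, g=35)
```

Scanning `folded_fraction` and `heat_capacity` over a temperature grid with these assumed parameters gives a folded window at intermediate temperatures bounded by an unfolded state on either side.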

We now calculate explicitly the difference in the thermodynamic functions between the unfolded and the folded
state. We consider these quantities per degree of freedom. The thermodynamic functions associated with a folded
(f) protein variable are the energy $e_f = -\mathcal{E}_0$, the entropy $s_f = \ln(g)$ and the free energy $f_f = -\mathcal{E}_0 - T\ln(g)$. The
free energy associated with an unfolded (u) protein variable is given by the corresponding partition function of water
multiplied by the degeneracy factor of an unfolded part of the protein: $f_u = -T\ln(2Z_w)$. The difference in free
energy between folded and unfolded state is accordingly

$$\Delta f = T \ln\!\left(\frac{g\exp(\beta\mathcal{E}_0)}{2Z_w}\right) \tag{9}$$

which is the quantity we earlier identified as the one which decides whether the system cooperatively selects the folded
or the unfolded state. To get a simple insight into this formula we rewrite it for small energy level spacings $\Delta\mathcal{E} \ll T$:

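$$\Delta f = \mathcal{E}_0 + \mathcal{E}_{\min} + T\ln\!\left(\frac{g\,\Delta\mathcal{E}}{2T}\right) - T\ln\!\left(1 - \exp(-(\mathcal{E}_{\max} - \mathcal{E}_{\min})/T)\right) \tag{10}$$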
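The step from Eq. (9) to Eq. (10) can be sketched as follows, assuming only that $\Delta\mathcal{E} \ll T$ so that the denominator of Eq. (6) may be linearised:

```latex
% Assumption: \Delta\mathcal{E} \ll T, so 1 - e^{-\Delta\mathcal{E}/T} \approx \Delta\mathcal{E}/T.
\begin{align*}
Z_w &= e^{-\mathcal{E}_{\min}/T}\,
       \frac{1 - e^{-(\mathcal{E}_{\max}-\mathcal{E}_{\min})/T}}{1 - e^{-\Delta\mathcal{E}/T}}
     \;\approx\; \frac{T}{\Delta\mathcal{E}}\, e^{-\mathcal{E}_{\min}/T}
       \left(1 - e^{-(\mathcal{E}_{\max}-\mathcal{E}_{\min})/T}\right), \\
\Delta f &= \mathcal{E}_0 + T\ln\frac{g}{2} - T\ln Z_w \\
  &\approx \mathcal{E}_0 + \mathcal{E}_{\min} + T\ln\frac{g\,\Delta\mathcal{E}}{2T}
     - T\ln\!\left(1 - e^{-(\mathcal{E}_{\max}-\mathcal{E}_{\min})/T}\right).
\end{align*}
```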

From this expression for the difference in free energy one easily obtains the corresponding differences in energy,
entropy and specific heat. In particular, we obtain a gap in the specific heat between the folded and unfolded state
$\Delta c = (\Delta\mathcal{E}/T)^2\, e^{\Delta\mathcal{E}/T} / (e^{\Delta\mathcal{E}/T} - 1)^2 \sim \exp(\Delta\mathcal{E}/T) \sim 1$ for temperatures $T \in [\Delta\mathcal{E}, \mathcal{E}_{\min} + \mathcal{E}_{\max}]$, see Fig. 2.

To simplify the discussion let us consider the limit of large $\mathcal{E}_{\max}$ ($\to \infty$). It is easily seen that $\Delta f$ has a maximum
at the temperature $T_m \approx g\,\Delta\mathcal{E}/2e$. The corresponding value of $\Delta f$ is $\Delta f(T_m) \approx (\mathcal{E}_{\min} + \mathcal{E}_0) + g\,\Delta\mathcal{E}/2e$, so the
condition for the existence of a region of stability of the ordered structure ($\Delta f > 0$) is:

$$\frac{g\,\Delta\mathcal{E}}{2e} > -(\mathcal{E}_{\min} + \mathcal{E}_0) \tag{11}$$

This is of course always satisfied if $(\mathcal{E}_{\min} + \mathcal{E}_0) > 0$; however, the more interesting situation is $(\mathcal{E}_{\min} + \mathcal{E}_0) < 0$, since
then $\Delta f < 0$ at sufficiently low temperature, i.e. the phenomenon of cold unfolding appears. Under these conditions
$\Delta E$ is also negative at sufficiently low temperature, which means that we have a negative latent heat for cold unfolding.
Fig. 3b shows these thermodynamic functions. They qualitatively reproduce the known thermodynamic behavior of
globular proteins as described in the introduction. In the present description the mechanism for the transitions is
the following. At high temperature the entropy gain of the protein causes the unfolding. As temperature is lowered 
the water exposed to hydrophobic parts of the protein gets more and more ordered, and consequently the system 
gains more entropy by shielding the hydrophobic residues from the water (folding). As the temperature is lowered 
even further the cold unfolding transition occurs: here all entropy contributions to free energy are small and the 
dominating effect is the coupling energy between the water and the unfolded protein. 
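The two transition temperatures implied by the sign changes of $\Delta f$ can be located numerically from Eq. (9). The sketch below uses plain bisection on either side of the estimate $T_m \approx g\,\Delta\mathcal{E}/2e$. The function names are hypothetical, and the parameter values ($\mathcal{E}_0 = 1$, $\mathcal{E}_{\min} = -2$, $\Delta\mathcal{E} = 0.2$, $g = 35$) are assumptions chosen so that $\mathcal{E}_{\min} + \mathcal{E}_0 < 0$ and condition (11) holds.

```python
import math

def delta_f(T, e0, e_min, d_e, g):
    """Eq. (9): T ln[g exp(e0/T) / (2 Z_w)], evaluated in log form."""
    beta = 1.0 / T
    log_zw = (-beta * e_min
              + math.log1p(-math.exp(-beta * g * d_e))
              - math.log1p(-math.exp(-beta * d_e)))
    return e0 + T * (math.log(g / 2.0) - log_zw)

def bisect(f, lo, hi, tol=1e-10):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Assumed parameters (see the lead-in).
e0, e_min, d_e, g = 1.0, -2.0, 0.2, 35

def f(T):
    return delta_f(T, e0, e_min, d_e, g)

T_est = g * d_e / (2.0 * math.e)   # estimated position of the stability maximum
T_cold = bisect(f, 0.3, T_est)     # cold unfolding: delta_f crosses zero from below
T_warm = bisect(f, T_est, 5.0)     # warm unfolding: delta_f crosses zero from above
```

With these assumed values the folded state is stable only for temperatures between `T_cold` and `T_warm`; increasing $-(\mathcal{E}_{\min} + \mathcal{E}_0)$ towards the bound in (11) narrows this window until it closes.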
Coming back to the partition function (3) and (4), we may write: 

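$$E = -N\mathcal{E}_0 + (N - n)(\mathcal{E}_0 + \mathcal{E}_{\min}) + \sum_{i=n+1}^{N} \Delta\mathcal{E}\, s_i = -N\mathcal{E}_0 + \sum_{i=n+1}^{N} [\Delta\mathcal{E}\, s_i + \mathcal{E}_0 + \mathcal{E}_{\min}] \tag{12}$$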

and 

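$$Z = e^{\beta N \mathcal{E}_0} \sum_{n=0}^{N-1} \sum_{\{s_i\}} 2^{N-n-1} g^n \exp\!\Big(-\beta \sum_{i=n+1}^{N} (\epsilon_i - \mu)\Big) + g^N \exp(\beta\mathcal{E}_0 N) \tag{13}$$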

where we have set $\epsilon_i = \Delta\mathcal{E}\, s_i$, $\mu = -(\mathcal{E}_0 + \mathcal{E}_{\min})$. From this expression for $Z$ we can identify $\mu$ with the chemical
potential of the water, or, to be more precise, the difference in chemical potential of the water when it is in contact
with the hydrophobic interior of the protein and when it is not. Therefore, $\mu > 0$ is the physically relevant situation.
Experimentally, $\mu$ can be changed by adding denaturants, changing pH, etc., which indeed alters the stability of the
ordered structure. The four curves in Fig. 2, which are to be compared with the experimental data in Fig. 1, are the
results of the model for different values of the chemical potential $\mu$.

In conclusion, this paper introduces a new model for the stability of proteins which reproduces the known
thermodynamics. We obtain: 1. first order unfolding transitions; 2. both warm and cold unfolding; 3. cooperativity in the
sense that the free energy difference stabilizing the folded state is only a fraction of $kT_{\rm room}$ per degree of freedom;
4. a qualitatively correct behavior of the specific heat both as a function of temperature and chemical potential; 5.
a gap in the specific heat between the unfolded and folded state; 6. a negative latent heat for the cold unfolding. A
deficiency of the model is that our description of the water-protein coupling is simplified. As a result, the two
transitions are too far apart in absolute temperature and in the model the cold unfolding appears sharper than the warm
unfolding, which is not seen in experiment. This deficiency calls for some modifications; in particular, by introducing
both hydrophobic and hydrophilic $\varphi_i$'s one can influence the relative strength of the two transitions.

FIG. 1. Calorimetric measurements of the specific heat of myoglobin at four different values of pH, as presented by Privalov
in ref. [?]. At sufficiently low pH the native structure of the protein never becomes stable, thus the protein remains in its
unfolded structure with approximately constant heat capacity over the measured temperature range. By increasing pH the 
native structure becomes stabilized for intermediate temperatures, defining a transition to an unfolded state at both low and 
high temperatures, denoted respectively cold and warm denaturation. There is a gap in the specific heat between the folded 
and the unfolded states. 

FIG. 2. Specific heat as function of temperature $T$ for the model, with four different values of the chemical potential
$\mu = -(\mathcal{E}_0 + \mathcal{E}_{\min}) = -1.0, -1.1, -1.2, -1.5$ and with fixed level spacing $\Delta\mathcal{E} = 0.2$, $g = 35$ and system size $N = 60$. As
the chemical potential is lowered, it becomes increasingly difficult to fold, and finally for sufficiently low $\mu$ the protein stays
unfolded.

FIG. 3a. Average fraction of folded protein variables as function of temperature for $\mu = -(\mathcal{E}_0 + \mathcal{E}_{\min}) = -1.0$ and the other
parameters as in Fig. 2. The figure shows that in between the transitions the protein is folded.

FIG. 3b. Difference of thermodynamic functions between folded and unfolded configurations for the same chemical potential
as in a). The difference in free energy $\Delta f$ has a maximum and becomes negative for both high and low temperature (cold
unfolding). $\Delta S/N$ and $\Delta E/N$ increase with temperature. $\Delta E/N$ is negative at the cold unfolding transition, corresponding to
a negative latent heat.